Projected sequential Gaussian processes: A C++ tool for interpolation of large datasets with heterogeneous noise
Abstract
Heterogeneous data sets arise naturally in most applications because a variety of sensors and measuring platforms are used. Such data sets can be heterogeneous in both their error characteristics and their sensor models. Treating such data is most naturally accomplished with a Bayesian or model-based geostatistical approach; however, such methods generally scale poorly with the size of the data set and require computationally expensive Monte Carlo based inference. Recently, within the machine learning and spatial statistics communities, many papers have explored the potential of reduced-rank representations of the covariance matrix, often referred to as projected or fixed-rank approaches. In such methods the covariance function of the posterior process is represented by a reduced-rank approximation, chosen so that information loss is minimal. In this paper a sequential Bayesian framework for inference in such projected processes is presented. The observations are considered one at a time, which avoids the need for the high-dimensional integrals typically required in a Bayesian approach. A C++ library, gptk, which is part of the INTAMAP web service, is introduced; it implements projected, sequential estimation and adds several novel features. In particular, the library includes the ability to use a generic observation operator, or sensor model, to permit data fusion. It can also cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the covariance parameters is explored, including the impact of the projected process approximation on likelihood profiles. We illustrate the projected sequential method in application to synthetic and real data sets. Limitations and extensions are discussed.
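The idea of considering observations one at a time can be illustrated with a minimal C++ sketch. It is not the gptk API and it omits the projection step. It assumes a squared-exponential covariance, a small fixed set of sites, and Gaussian noise whose variance may differ per observation. All names (`sqExpKernel`, `GaussianState`, `makePrior`, `absorbObservation`) are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Squared-exponential covariance; an assumed kernel choice for this sketch.
double sqExpKernel(double a, double b, double variance, double lengthScale) {
    const double d = (a - b) / lengthScale;
    return variance * std::exp(-0.5 * d * d);
}

// Gaussian belief about the latent function at a fixed set of sites.
struct GaussianState {
    std::vector<double> mean;              // posterior mean at each site
    std::vector<std::vector<double>> cov;  // posterior covariance between sites
};

// Zero-mean Gaussian process prior evaluated at the given sites.
GaussianState makePrior(const std::vector<double>& sites,
                        double variance, double lengthScale) {
    const std::size_t n = sites.size();
    GaussianState s{std::vector<double>(n, 0.0),
                    std::vector<std::vector<double>>(n, std::vector<double>(n, 0.0))};
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            s.cov[i][j] = sqExpKernel(sites[i], sites[j], variance, lengthScale);
    return s;
}

// Absorb a single observation y of site idx, with its own Gaussian noise
// variance. This is exact Gaussian conditioning written as a rank-one update,
// so only scalar divisions are needed and no matrix is inverted.
void absorbObservation(GaussianState& s, std::size_t idx, double y, double noiseVar) {
    assert(idx < s.mean.size() && noiseVar > 0.0);
    const std::size_t n = s.mean.size();
    const std::vector<double> k = s.cov[idx];  // copy: cov is overwritten below
    const double predVar = k[idx] + noiseVar;  // predictive variance of y
    const double residual = y - s.mean[idx];   // innovation
    for (std::size_t i = 0; i < n; ++i)
        s.mean[i] += k[i] * residual / predVar;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            s.cov[i][j] -= k[i] * k[j] / predVar;
}
```

Calling `absorbObservation` once per datum, each with its own `noiseVar`, handles heterogeneous Gaussian noise without forming or inverting a full data covariance matrix. The non-Gaussian errors, generic observation operators and reduced-rank projection described in the abstract are beyond this sketch.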
Similar resources
Projected Sequential Gaussian Processes: A C++ tool for interpolation of heterogeneous data sets
Within MUCM there might occasionally arise the need to use large training set sizes, or to employ observations with non-Gaussian noise characteristics or non-linear sensor models in a calibration stage. This technical report deals with Gaussian process models in these non-Gaussian and/or large data set cases. Treating such data within Gaussian processes is most naturally accomplished using...
Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition
The Hidden Markov Model (HMM) is a popular statistical method used in continuous and discrete speech recognition. The probability density function of the observation vectors in each state is estimated with discrete density or continuous density modeling. The performance (in correct word recognition rate) of the continuous density HMM is higher than that of the discrete density HMM, but its computational complexity is very ...
GPatt: Fast Multidimensional Pattern Extrapolation with Gaussian Processes
Gaussian processes are typically used for smoothing and interpolation on small datasets. We introduce a new Bayesian nonparametric framework – GPatt – enabling automatic pattern extrapolation with Gaussian processes on large multidimensional datasets. GPatt unifies and extends highly expressive kernels and fast exact inference techniques. Without human intervention – no hand crafting of kernel ...
Adaptive Signal Detection in Auto-Regressive Interference with Gaussian Spectrum
A detector for the case of a radar target with known Doppler and unknown complex amplitude in complex Gaussian noise with unknown parameters has been derived. The detector assumes that the noise is an Auto-Regressive (AR) process with Gaussian autocorrelation function which is a suitable model for ground clutter in most scenarios involving airborne radars. The detector estimates the unknown...
Journal:
- Computers & Geosciences
Volume 37
Publication date: 2011